Show code
#install libraries
library(ggmap)
library(dplyr)
library(stplanr)
library(osmdata)
library(sf)
library(tidyverse)
library(leaflet)
library(purrr)
library(kableExtra)
library(scales)Routes in Queens, New York City, NY
Alan Vlakancic
November 30, 2025
This project uses stplanr transport modeling package to design an optimal transport route for bus or cycle routes in New York City. Stplanr is a transport planning visualization R package that can be used to plan transit networks, in addition to transit planning elements such as identifying transit catchment areas, origin/destination data and ride frequency visualizations, among others. In their white paper, the devlopers of stplanr call for an accountable, transparent and democratized transit planning system that doesn’t rely on proprietary and often vastly different data sources and data processing softwares. Although the package can visualize a whole host of data, this project will focus on comparing direct desire lines or “Euclidean” routes (as the crow flies), existing bus networks and stplanr’s optimized routes. To wit: this can map the efficiency of the bus routes compared to the most direct route possible if there were no built environment factors in the way.
To adequately compare desire lines, bus routes, and the most efficient routes with the current street network, this project will require at minimum three data sources. Each of these will be sourced separately and overlaid onto each other:
#SHAPEFILES AND MAP
bbox <- c(left = -73.96, bottom = 40.54, right = -73.70, top = 40.81)
#create bounding box for NYC
nyc_map <- "data/"#
#nyc basemap, downloaded from nyc open data. source: https://search.r-project.org/CRAN/refmans/ptools/html/nyc_bor.html
nyc_sf <- st_read(nyc_map, quiet =TRUE)
#bring in nyc_map as a sf
shelters_sf <- read_csv("data/Bus_Stop_Shelter_20251020.csv")
#NOTE: REPLACE WITH DATA WHEN USING QUARTO!
#this brings in the bus stop shelter information. source: https://data.cityofnewyork.us/Transportation/Bus-Stop-Shelters/qafz-7myz
shelters_sf_fix <- st_as_sf(shelters_sf, coords = c("Longitude","Latitude"), crs = 4326)
#convert to sf object with the correct coordinate reference system
osmdata::set_overpass_url("https://overpass-api.de/api/interpreter")
#set overpass url for open street maps
osm_data <- opq(bbox = bbox) %>%
add_osm_feature(key = "highway", value = c("primary","secondary")) %>%
osmdata_sf()
#import data for primary and secondary highways from open street maps# STPLANR
shelters_sf_fix <- shelters_sf_fix %>%
mutate(id = paste0("S", row_number()))
#add ID column for origin-destination pairs so they have a corresponding number
flow_all <- expand.grid(o = shelters_sf_fix$id, #create origin
d = shelters_sf_fix$id, #create destination
stringsAsFactors = FALSE) %>% #make sure they aren't factors
filter(o != d) %>% # this remove self-pairs so O is not D
mutate(trips = 1) %>% #add trip count of 1 for each pair
sample_n(50) #sample 50 random paris to avoid blowing up the computer
desire_lines_all <- od2line(flow_all, zones = shelters_sf_fix, zone_code = "id") #use od2line function to create desire lines (euclidean) for all pairs
shelter_coords <- shelters_sf_fix %>%
st_coordinates() %>%
as.data.frame() %>%
bind_cols(id = shelters_sf_fix$id)
#extract coordinates and bind with ID column
route_single <- function(o_id, d_id) { #function to create a single route between origin and destination
o <- shelter_coords %>% filter(id == o_id)
#filter to get origin coordinates
d <- shelter_coords %>% filter(id == d_id)
#filter to get destination coordinates
r <- try(route_osrm(from = c(o$X, o$Y),
to = c(d$X, d$Y)), silent = TRUE)
#use try to catch errors (e.g., no route found)
if (inherits(r, "try-error")) return(NULL)
#if route found, return the route
return(r)
}
routes_list <- purrr::map2(flow_all$o, flow_all$d, route_single)
#create routes for all origin-destination pairs using the route_single function
routes_list <- routes_list[!sapply(routes_list, is.null)]
#remove any NULL results (failed routes)
routes_sf <- do.call(rbind, routes_list)
#combine all routes into a single sf object#BELOW: create a comparison between route lengths and desire line lengths & calculate means and percent change
routes_projection <- st_transform(routes_sf, 32618)
desire_projection <- st_transform(desire_lines_all, 32618)
#ensures the correct, projected shapefile for computation not mapping
route_length <- st_length(routes_projection)
desire_length <- st_length(desire_projection)
#compute lengths
route_length <- as.numeric(route_length)
desire_length <- as.numeric(desire_length)
#convert lengths to numeric values
lengths_tbl <- tibble(
route_m = route_length,
desire_m = desire_length,
origin = flow_all$o,
destination = flow_all$d
)
#create tibble to compare lengths in the final map w/ IDs
lengths_tbl_print <- tibble(
route_m = comma(round(route_length)),
desire_m = comma(round(desire_length)),
origin = flow_all$o,
destination = flow_all$d
)
#tidy data for later printing in a kable
mean_route <- mean(route_length, na.rm = TRUE)
mean_desire <- mean(desire_length, na.rm = TRUE)
#calculate means for both route and desire lengths
percent_change <- ((mean_route - mean_desire) / mean_desire) * 100
#calculate percent change
mean_lengths <- data.frame(
type = c("Route Length", "Desire Line Length"),
mean_length_m = c(round(mean_route), round(mean_desire)))
#put these into a data frame, rounded to whole numbers
mean_lengths <- mean_lengths %>%
mutate(
mean_length_km = mean_length_m / 1000,
percent_change = c(percent_change, NA)
)
#add km conversion and percent change to the data frame, i converted to KM for ease of computation (ie: dividing by 1,000)
mean_lengths_print <- mean_lengths %>%
mutate(
mean_length_km = comma(round(mean_length_m / 1000)),
percent_change = comma(round(percent_change))
)
#tidy data for later printing in a kable#LEAFLET PREP
nyc_leaflet <- st_transform(nyc_sf, 4326)
roads_leaflet <- st_transform(osm_data$osm_lines, 4326)
desire_leaflet <- st_transform(desire_lines_all, 4326)
routes_leaflet <- st_transform(routes_sf, 4326)
#transform all data to WGS84 for leaflet mapping
desire_leaflet_popup <- paste0(
"<b>Desire Line</b><br/>",
"Origin: ", desire_leaflet$o, "<br/>",
"Destination: ", desire_leaflet$d, "<br/>",
"Desire Line Distance: ", round(lengths_tbl$desire_m / 1000), " km"
)
#create popup info for desire lines for interactive map
routes_leaflet_popup <- paste0(
"<b>OSRM Route</b><br/>",
"Origin: ", desire_leaflet$o, "<br/>",
"Destination: ", desire_leaflet$d, "<br/>",
"Route Distance: ", round(lengths_tbl$route_m / 1000), " km<br/>"
)
#create popup info for routes for interactive map
pal_desire <- colorNumeric(
palette = "viridis",
domain = lengths_tbl$desire_m
)
#create color palette for desire lines based on distance
pal_routes <- colorNumeric(
palette = "inferno",
domain = lengths_tbl$route_m
)
#create color palette for routes based on distance
selected_ids <- unique(c(flow_all$o, flow_all$d))
#get unique IDs of sampled shelters
selected_shelters <- shelters_sf_fix %>%
filter(id %in% selected_ids)
#filter shelters to only those that were sampled#LEAFLET MAP
leaflet() %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(data = nyc_leaflet,
color = "black", weight = 3,
fillOpacity = 0.1,
group = "NYC Boundary") %>%
# Add OSM roads w/ positron background
addPolylines(
data = desire_leaflet,
color = pal_desire(lengths_tbl$desire_m),
weight = 3,
opacity = 0.7,
popup = desire_leaflet_popup,
group = "Desire Lines"
) %>%
addPolylines(
data = routes_leaflet,
color = pal_routes(lengths_tbl$route_m),
weight = 3,
opacity = 0.8,
popup = routes_leaflet_popup,
group = "OSRM Routes"
) %>%
addLegend(
pal = pal_desire,
values = lengths_tbl$desire_m,
title = "Desire Line Distance (m)",
position = "bottomright",
group = "Desire Lines"
) %>%
#add a legend for desire lines
addLegend(
pal = pal_routes,
values = lengths_tbl$route_m,
title = "OSRM Route Distance (m)",
position = "bottomleft",
group = "OSRM Routes"
) %>%
#add a legend for osrm routes
addCircleMarkers(data = shelters_sf_fix,
color = "blue",
radius = 1,
popup = ~id,
group = "Bus Shelters") %>%
#add all bus shelters
addCircleMarkers(
data = selected_shelters,
color = "purple",
radius = 6,
fillOpacity = 0.9,
group = "Sampled Shelters"
) %>%
#add sampled bus shelters with purple markers
addLayersControl(
overlayGroups = c("NYC Boundary", "Sampled Shelters",
"Desire Lines", "OSRM Routes",
"Bus Shelters"),
options = layersControlOptions(collapsed = FALSE)
)| Type | Mean Length (m) | Mean Length (km) | Percent Change (%) |
|---|---|---|---|
| Route Length | 17860 | 18 | 21 |
| Desire Line Length | 14820 | 15 | NA |
| Route Length (m) | Desire Line Length (m) | Origin ID | Destination ID |
|---|---|---|---|
| 27,014 | 24,234 | S656 | S2356 |
| 15,823 | 13,618 | S247 | S1792 |
| 7,740 | 7,128 | S845 | S1300 |
| 31,900 | 30,500 | S3219 | S1057 |
| 17,033 | 13,924 | S615 | S2355 |
| 14,802 | 12,871 | S2432 | S1517 |
| 28,293 | 20,968 | S1155 | S2261 |
| 12,050 | 10,704 | S125 | S470 |
| 7,296 | 5,190 | S2228 | S1827 |
| 6,300 | 4,986 | S319 | S572 |
| 10,956 | 8,804 | S1468 | S162 |
| 25,778 | 20,727 | S1208 | S700 |
| 5,203 | 4,417 | S1991 | S1522 |
| 20,473 | 18,374 | S907 | S1672 |
| 6,942 | 5,491 | S1053 | S1232 |
| 5,331 | 4,110 | S1028 | S1223 |
| 24,112 | 22,097 | S3227 | S1516 |
| 9,347 | 6,096 | S839 | S2205 |
| 16,067 | 13,753 | S2705 | S788 |
| 21,181 | 18,646 | S1551 | S27 |
| 2,595 | 2,268 | S750 | S739 |
| 23,943 | 20,683 | S1867 | S2802 |
| 6,937 | 5,778 | S1740 | S2082 |
| 18,390 | 17,004 | S2931 | S2101 |
| 30,206 | 6,145 | S29 | S3195 |
| 43,611 | 39,475 | S1381 | S3254 |
| 21,297 | 18,497 | S1478 | S2860 |
| 18,180 | 17,012 | S31 | S1686 |
| 36,747 | 31,279 | S862 | S926 |
| 19,801 | 17,714 | S2895 | S817 |
| 14,537 | 13,511 | S546 | S746 |
| 28,727 | 21,128 | S947 | S3164 |
| 28,059 | 22,398 | S595 | S2455 |
| 6,403 | 5,388 | S2133 | S96 |
| 795 | 776 | S2988 | S2329 |
| 16,704 | 15,239 | S60 | S2700 |
| 14,380 | 11,593 | S964 | S2588 |
| 6,159 | 4,483 | S1809 | S2896 |
| 29,767 | 25,151 | S700 | S1073 |
| 23,550 | 21,978 | S2165 | S3269 |
| 11,918 | 8,643 | S2206 | S1416 |
| 40,755 | 33,680 | S3248 | S3028 |
| 15,243 | 13,084 | S2318 | S17 |
| 1,654 | 975 | S1883 | S1350 |
| 7,441 | 6,849 | S267 | S39 |
| 48,625 | 41,065 | S3152 | S3367 |
| 12,063 | 10,613 | S1738 | S2123 |
| 25,513 | 21,091 | S545 | S1212 |
| 12,146 | 10,088 | S2664 | S227 |
| 13,195 | 10,752 | S1876 | S2559 |
The mean route length for the optimized routes for this particular sample run is 17.86 km, while the mean Euclidean desire line length is 14.82 km. This represents a percent change of 20.51% longer for the optimized routes compared to the direct desire lines. The interactive map above visualizes these routes, with desire lines colored based on their lengths and Open Street Routing Machine (OSRM) routes similarly colored with a different theme.
The results indicate that the optimized OSRM routes are significantly longer than the direct desire lines, which is generally expected given the constraints of the built environment and road network. The percent change of 20.51% suggests that while the desire lines represent the most direct path between two points, real-world travel must navigate around obstacles, follow roadways, and adhere to traffic regulations etc.
Future research could expand the sample size to include more origin-destination pairs, or even all bus stops. Incorporating real-world travel time data, traffic patterns, and transit schedules could provide a more comprehensive understanding of route efficiency. Further analysis could also explore the impact of different modes of transportation, such as cycling or walking, on route optimization and efficiency.
https://www.mta.info/agency/new-york-city-transit/subway-bus-facts-2019 https://docs.ropensci.org/stplanr/